High Data-Rate Processing System

ABSTRACT

A data processing system includes a hub processing portion, and a first plurality of processing resources communicatively connected to define a first ring, wherein each processing resource of the first plurality of processing resources is communicatively connected to the hub processing portion.

BACKGROUND

The present disclosure relates generally to data processing and more specifically to processing architectures with high data-rate processing.

Processing systems often include numerous processing resources that receive packets of data and processing instructions. The processing systems may include different processing resources having different functions and capabilities. Thus, some data processing tasks may include the use of numerous processors to perform portions of the processing tasks.

The transmission of data between the processing resources may be limited by the bandwidth of the connections between the processing resources. The limitations in bandwidth may reduce the overall processing performance of the systems.

SUMMARY

According to one embodiment of the present invention, a data processing system includes a hub processing portion having a point-to-point data switching portion, a first processing resource having a direct memory access (DMA) data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion, a second processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion and the DMA data communication portion of the first processing resource, a third processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion and the DMA data communication portion of the second processing resource, and a fourth processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion, the DMA data communication portion of the third processing resource and the DMA data communication portion of the first processing resource.

According to another embodiment of the present invention, a data processing system includes a hub processing portion, and a first plurality of processing resources communicatively connected to define a first ring, wherein each processing resource of the first plurality of processing resources is communicatively connected to the hub processing portion.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of this disclosure, reference is now made to the following brief description, taken in connection with the accompanying drawings and detailed description, wherein like reference numerals represent like parts:

FIG. 1 illustrates an exemplary embodiment of a data processing system;

FIG. 2 illustrates an alternate exemplary embodiment of a data processing system;

FIG. 3 illustrates a block diagram of an exemplary embodiment of the processing resources of the system of FIG. 1;

FIG. 4 illustrates a block diagram of an exemplary embodiment of the hub processing portion of the system of FIG. 1;

FIG. 5 illustrates a block diagram of an alternate exemplary embodiment of a data processing system; and

FIG. 6 illustrates a block diagram of an exemplary embodiment of a GPU of FIG. 5.

DETAILED DESCRIPTION

Processing capability continues to increase, and a steadily increasing number of individual and group users results in the network traffic that connects processors expanding at an ever faster rate. Some computational tasks use iterative or recursive computations that include iterative analysis at various steps in the process. Though the individual computations may not use significant processing resources, the iterative nature of the analysis uses data transfer resources, which may reduce the efficiency of the processing system due to data transfer bottlenecks. Typical data centers have a processing to data bandwidth ratio (P/D) (e.g., GFLOPS/Gwords per second) of about 1000-5000. For some processing tasks, this ratio may be too high (i.e., limited by data transfer rates), as many iterative or recursive types of computational tasks require P/D ratios of several hundred for each step (i.e., before a major branch in the computational tasking). Thus, a system that optimizes the P/D ratio for these types of tasks is described below using Peripheral Component Interconnect-express (PCIe) type switches that are arranged on system processing boards for connectivity.
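As a rough illustration of the P/D ratio discussed above, the following Python sketch divides a compute rate by a data bandwidth expressed in words per second. The function name, the 500 GFLOPS figure, the 10 Gb/s network link, and the 64-bit word size are hypothetical values chosen only to land in the ranges named above; they are not measurements from this disclosure.

```python
def pd_ratio(gflops: float, gwords_per_s: float) -> float:
    """Processing-to-data-bandwidth ratio in GFLOPS per Gword/s."""
    if gwords_per_s <= 0:
        raise ValueError("data bandwidth must be positive")
    return gflops / gwords_per_s


# Assumption: 64-bit (8-byte) words and a hypothetical 500 GFLOPS node.
NODE_GFLOPS = 500.0

# A 10 Gb/s network link carries 10 / 64 Gwords/s of 64-bit words.
network_gwords = 10 / 64

# An 8 GB/s link carries 8 / 8 = 1 Gword/s of 8-byte words.
switch_gwords = 8 / 8

print(pd_ratio(NODE_GFLOPS, network_gwords))
print(pd_ratio(NODE_GFLOPS, switch_gwords))
```

Under these assumed figures, the same node moves from a ratio in the thousands to a ratio of a few hundred when the faster link is used, which is the direction of improvement the description aims for.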

FIG. 1 illustrates an exemplary embodiment of a data processing system (system) 100. The system 100 includes a hub processor element 102 that includes an input/output (I/O) portion 104 and a processing portion 106. The I/O portion 104 may include, for example, one or more communications boards having I/O processing features and connectors. The processing portion 106 may include one or more processors that are communicatively connected to each other and to the I/O portion 104. The I/O portion 104 is communicatively connected to a data and storage network 101 via connections 103 that may include, for example, 10G Ethernet®, 40G Ethernet®, or high speed InfiniBand® connections. In the illustrated embodiment, the processing portion 106 includes two processing boards, each with a Peripheral Component Interconnect-express (PCIe) type switch that provides communicative connections 110 a-d directly (i.e., with direct memory access) between the processors and peripheral devices of the processing boards. The PCIe connections of the processing portion 106 are also connected to the PCIe connections of processing resources 108 a-d (via PCIe switches).

In this regard, the PCIe connections include on-motherboard closely coupled high speed point to point packet switches using multiple bi-directional high speed links (e.g., PCIe) to on-motherboard devices and to a backplane containing board-to-board physical connections of multiple of these switches. The links are of a similar type to those attaching (from an electrical and signal perspective) directly to the CPU package (e.g., PCIe) to minimize both physical and throughput overhead associated with translation from one protocol (e.g., PCIe directly connected to the CPU) to another (e.g., Ethernet from a PCIe connected network card). This arrangement allows for both the board to board connections to be referenced, connects the board to board links with the ones going to the CPU, and references other on-board devices (since the FPGA or Tilera processing elements mounted to the boards communicate to the on-board switch in a similar manner as the main CPU(s) on the board).

The processing resources 108 a-d each include a processing portion that includes one or more processing elements and a PCIe type switch that provides a communicative connection to the PCIe connections of the processing portion 106 and to another processing resource 108 that is communicatively arranged in a “ring A” defined by the processing resources 108 a-d and the connections between the processing resources 108 a-d. In this regard, the PCIe switch of each processing resource 108 a-d is connected to the PCIe switches of two other processing resources 108 a-d in the ring via the connections 112 a-d, which are communicative connections between PCIe type switches. The processing resources 108 e-h are similar to the processing resources 108 a-d, and are communicatively arranged in a “ring B.” Each of the processing resources 108 e-h is connected to the PCIe switches of three other processing resources 108 via on-board PCIe type switches. In this regard, the ring B includes the processing resources 108 e-h and the communicative connections 112 e-h. Each of the processing resources 108 e-h is connected to one of the processing resources 108 a-d via PCIe type switches by connections 110 e-h. Each of the processing resources 108 is communicatively connected to the data and storage network 101 via connections 105 that may include, for example, 10G Ethernet®, 40G Ethernet®, or high speed InfiniBand® connections.

The processing resources 108 are arranged in “branches” that are defined by communicative connections arranged in series from the hub processor element 102. In this regard, a branch I is defined by the connection 110 a, the processing resource 108 a, the connection 110 e and the processing resource 108 e. The branch II is defined by the connection 110 b, the processing resource 108 b, the connection 110 f and the processing resource 108 f. The branch III is defined by the connection 110 c, the processing resource 108 c, the connection 110 g and the processing resource 108 g. The branch IV is defined by the connection 110 d, the processing resource 108 d, the connection 110 h and the processing resource 108 h.
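The hub, ring, and branch arrangement of FIG. 1 can be sketched as a small labeled graph. The following Python is a minimal illustration, not part of the disclosed system: the node names (such as "hub102"), the helper names, and the assignment of each ring connection label 112 a-h to a particular pair of neighbors are assumptions made for the example.

```python
from collections import defaultdict

RING_A = ["108a", "108b", "108c", "108d"]
RING_B = ["108e", "108f", "108g", "108h"]
HUB = "hub102"  # hypothetical node name for the hub processor element 102


def build_fig1_topology():
    """Return {node: {neighbor: connection label}} for the hub and rings A and B."""
    adj = defaultdict(dict)

    def link(u, v, label):
        # Connections are bidirectional, so record both directions.
        adj[u][v] = label
        adj[v][u] = label

    for i in range(4):
        link(HUB, RING_A[i], "110" + "abcd"[i])        # hub to ring A
        link(RING_A[i], RING_B[i], "110" + "efgh"[i])  # ring A to ring B
        # Ring links: which label joins which pair of neighbors is assumed.
        link(RING_A[i], RING_A[(i + 1) % 4], "112" + "abcd"[i])
        link(RING_B[i], RING_B[(i + 1) % 4], "112" + "efgh"[i])
    return adj


def branch(adj, i):
    """Connections and resources in series from the hub on branch i (0-based)."""
    first, second = RING_A[i], RING_B[i]
    return [adj[HUB][first], first, adj[first][second], second]


topology = build_fig1_topology()
print(branch(topology, 0))                           # branch I
print(len(topology["108a"]), len(topology["108e"]))  # neighbor counts
```

In this sketch a ring A resource has four neighbors (the hub, two ring neighbors, and one ring B resource), while a ring B resource has three, consistent with the connection counts described above.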

The connections 110 and 112 provide data flow paths between processing resources 108 and between the processing resources 108 and the hub processor element 102. For example, the hub processor element 102 may receive a processing task via a connection 103 and the data and storage network 101. The hub processor element 102 may perform some processing of the processing task and send the task or portions of the task to the processing resource 108 d. The processing resource 108 d may perform a processing task and send the results and a related processing task to the processing resource 108 f via any available transmission path (e.g., connection 112 d, processing resource 108 a, connection 110 e, processing resource 108 e, and connection 112 e; or via the data and storage network 101). The processing resource 108 f may send output to the data and storage network 101 via a connection 105, or may send the output to the hub processor element 102, which may send the output to the data and storage network 101 via a connection 103, or may perform or direct additional processing via a processing resource 108.
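To make "any available transmission path" concrete, the following Python sketch enumerates the minimum-hop routes between two processing resources over the switch connections of FIG. 1. It is illustrative only: the node names, the helper functions, and the use of hop count as the route metric are assumptions, and paths through the data and storage network 101 are not modeled.

```python
from collections import deque

RING_A = ["108a", "108b", "108c", "108d"]
RING_B = ["108e", "108f", "108g", "108h"]
HUB = "hub102"  # hypothetical node name for the hub processor element 102


def fig1_adjacency():
    """Undirected adjacency sets for the hub, ring A, and ring B."""
    adj = {n: set() for n in [HUB] + RING_A + RING_B}

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(4):
        link(HUB, RING_A[i])                  # connections 110 a-d
        link(RING_A[i], RING_B[i])            # connections 110 e-h
        link(RING_A[i], RING_A[(i + 1) % 4])  # ring A connections
        link(RING_B[i], RING_B[(i + 1) % 4])  # ring B connections
    return adj


def shortest_paths(adj, src, dst):
    """Return every minimum-hop path from src to dst."""
    # Breadth-first search records the hop distance of each node from src.
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    if dst not in dist:
        return []

    paths = []

    def walk(node, suffix):
        # Walk backward from dst, stepping only to nodes one hop closer to src.
        if node == src:
            paths.append([src] + suffix)
            return
        for prev in adj[node]:
            if dist.get(prev) == dist[node] - 1:
                walk(prev, [node] + suffix)

    walk(dst, [])
    return paths


adj = fig1_adjacency()
paths = shortest_paths(adj, "108d", "108f")
print(len(paths[0]) - 1)                          # hops on a minimum-hop route
print(["108d", "108a", "108e", "108f"] in paths)  # is the example route one of them?
```

Under these assumptions, the example route through processing resources 108 a and 108 e is one of several three-hop routes between 108 d and 108 f, which illustrates why multiple paths are available for the same transfer.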

The topological configurations described herein allow for a minimization of bottlenecks of data flows since each of the connections operates at approximately the same speed. Such an arrangement achieves high efficiency for data processing tasks that involve significant data transfer and iterative or recursive aspects. In this regard, the processing resources 108 need not be identical or similar. For example, the processing resource 108 a may be optimized for one type of processing (e.g., a graphical processing unit(s) for mathematical matrix computations), while the processing resource 108 b may be optimized for another type of processing (e.g., a field programmable gate array(s) for digital signal processing tasks). Thus, the systems described herein allow for data to be moved efficiently between processing resources 108 such that a processing resource 108 that is optimized or designed to efficiently perform a particular processing task may efficiently receive the data and perform the task rather than retaining the data at a processing resource 108 that is less efficient with regard to a particular desired processing task.

The connections 110 and 112 of the illustrated exemplary embodiment include 8 GB/s (total bidirectional peak theoretical rate on each of the links, e.g., 112 a may be 8 GB/s and 112 b may be 8 GB/s) data flow rates; however, any suitable data flow rate may be used to increase the efficiency of the system 100. Any number of additional “rings” and branches may be added to increase the processing capabilities of the system without reducing the data flow rate between elements. In this regard, FIG. 2 illustrates an alternate exemplary embodiment of a system 200 that includes a hub processing portion 102 and three rings (A-C) and eight branches (I-VIII) of processing resources 108 (processing nodes) that are connected by connections 110 and 112 between PCIe switches in a similar manner as described above. As additional rings are added, additional branches may be added to maintain the data flow rates between the processing resources 108 and the hub processing portion 102.
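The effect of adding rings and branches on the number of processing resources and switch-to-switch links can be estimated with a short calculation. The Python sketch below assumes that every ring contains one processing resource per branch and that FIG. 2 follows the same connection pattern as FIG. 1; the function name is hypothetical, and the aggregate figure is a plain sum of per-link peak theoretical rates, not an achievable throughput.

```python
def count_topology(rings: int, branches: int, link_gb_per_s: float = 8.0) -> dict:
    """Count nodes and links for a hub with concentric rings (assumed pattern)."""
    if rings < 1 or branches < 3:
        raise ValueError("need at least one ring and three branches")
    nodes = rings * branches                   # one resource per ring per branch
    hub_links = branches                       # hub to innermost ring
    inter_ring_links = branches * (rings - 1)  # series links along each branch
    ring_links = branches * rings              # links closing each ring
    links = hub_links + inter_ring_links + ring_links
    return {
        "nodes": nodes,
        "links": links,
        "aggregate_peak_gb_per_s": links * link_gb_per_s,
    }


print(count_topology(rings=2, branches=4))  # arrangement of FIG. 1
print(count_topology(rings=3, branches=8))  # assumed arrangement of FIG. 2
```

Under these assumptions, the link count grows in proportion to the node count (two links per processing resource), which is one way to read the statement that capacity can be added without reducing the per-link data flow rate.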

FIG. 3 illustrates a block diagram of an exemplary embodiment of the processing resources 108 a and 108 e of the system 100 (of FIG. 1). Each of the processing resources 108 includes a PCIe type switch portion 302, processor portions with I/O connections 304, a processor portion 306, and a field programmable gate array (FPGA) portion 308, each of which is connected to the PCIe type switch portion 302.

FIG. 4 illustrates a block diagram of an exemplary embodiment of the hub processing portion 102. The hub processing portion includes the I/O portion 104 and the processing arrangement portion 106. The processing arrangement portion 106 includes a first processing component 402 a and a second processing component 402 b that each include processing elements 404 that have PCIe connections that are communicatively connected to a PCIe type switch portion 406 in addition to a separate connection directly between the two processing elements on a single board. The processing components 402 a and 402 b include FPGA portions 408 that are communicatively connected to the PCIe type switch portion 406. The FPGA portions 408 may include, for example, firmware to effect the PCIe root complex address translation (i.e., implementation of a PCIe non-transparent bridge). Such firmware enables this configuration to operate similarly to a meshed network rather than a master with an array of slave devices, which would have more limited data exchange capability and greater overhead. In this regard, each element is its own root complex and element to element connections are provided through switches. For example, processing resource 108 f communicating with processing resource 108 d may use the switch on processing resource 108 a. If processing resource 108 a is communicating with the processing resource 108 g at the same time, the processing resource 108 a-108 d link would be used twice.
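The link-sharing example above can be checked by counting how many concurrent flows cross each link. The following Python sketch is illustrative only: the two routes are assumed for the example (the description states only that the 108 f to 108 d flow uses the switch on 108 a and that the 108 a-108 d link is shared), and the function name is hypothetical.

```python
from collections import Counter


def link_load(flows):
    """Count how many flows traverse each undirected link."""
    load = Counter()
    for path in flows:
        # Each consecutive pair of nodes on a path is one link traversal.
        for u, v in zip(path, path[1:]):
            load[frozenset((u, v))] += 1
    return load


# Assumed routes for two simultaneous transfers.
flows = [
    ["108f", "108e", "108a", "108d"],  # 108f to 108d through the switch on 108a
    ["108a", "108d", "108h", "108g"],  # 108a to 108g, one possible route via 108d
]

load = link_load(flows)
shared = [sorted(link) for link, count in load.items() if count > 1]
print(shared)
```

With these assumed routes, the only link carrying more than one flow is the one between 108 a and 108 d, matching the situation described above. A scheduler could use such a count to prefer an alternate route for one of the flows.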

The I/O portion 104 includes a first I/O component 401 a and a second I/O component 401 b that each include a PCIe type switch 403 that is communicatively connected to an FPGA portion 405, a processing element 407, and I/O elements 409 that may include, for example, FPGAs and/or an additional processor that performs I/O or other types of processing.

FIG. 5 illustrates a block diagram of an alternate exemplary embodiment of a data processing system 500. The system 500 is similar to the system 100 (of FIG. 1) described above and includes graphics processing units (GPUs) 502 a-d that are communicatively connected to the PCIe connections of corresponding processing resources 108 e-h with PCIe type switches via connections 110 i-L. The GPUs 502 a-d may also be connected to the PCIe connections of the hub processor element 102 with PCIe type switches via connections 510 a-d.

FIG. 6 illustrates a block diagram of an exemplary embodiment of a GPU 502 a. In this regard, the GPU 502 a includes GPU processing elements 602 that are communicatively connected to a PCIe type switch 604.

Though the illustrated embodiments described above include PCIe type switches, which may include, for example, any type of PCIe device capable of implementing multiple point-to-point data paths and providing packet-switched data exchange between these paths, alternate embodiments may include any other types of switching devices and/or connection physical links, protocols, and methods that facilitate connections between the direct (i.e., not through a chipset based I/O controller) data paths of processing elements.

While the disclosure has been described with reference to a preferred embodiment or embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the disclosure. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the disclosure not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this disclosure, but that the disclosure will include all embodiments falling within the scope of the appended claims.

What is claimed is:
 1. A data processing system comprising: a hub processing portion having a point-to-point data switching portion; a first processing resource having a direct memory access (DMA) data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion; a second processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion and the DMA data communication portion of the first processing resource; a third processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion and the DMA data communication portion of the second processing resource; and a fourth processing resource having a DMA data communication portion communicatively connected to the point-to-point data switching portion of the hub processing portion, the DMA data communication portion of the third processing resource and the DMA data communication portion of the first processing resource.
 2. The system of claim 1, further comprising: a fifth processing resource having a DMA data communication portion communicatively connected to the DMA data communication portion of the first processing resource; a sixth processing resource having a DMA data communication portion communicatively connected to the DMA data communication portion of the second processing resource and the DMA data communication portion of the fifth processing resource; a seventh processing resource having a DMA data communication portion communicatively connected to the DMA data communication portion of the third processing resource and the DMA data communication portion of the sixth processing resource; and an eighth processing resource having a DMA data communication portion communicatively connected to the DMA data communication portion of the fourth processing resource, the DMA data communication portion of the seventh processing resource and the DMA data communication portion of the fifth processing resource.
 3. The system of claim 1, wherein each of the DMA data communication portions is connected via a peripheral component interconnect-express (PCIe) switch portion.
 4. The system of claim 1, wherein the hub processing portion includes an input/output (I/O) portion communicatively connected to a data network.
 5. The system of claim 1, wherein the first processing resource is communicatively connected to a data network with a first communicative link, the second processing resource is communicatively connected to a data network with a second communicative link, the third processing resource is communicatively connected to a data network with a third communicative link, and the fourth processing resource is communicatively connected to a data network with a fourth communicative link.
 6. The system of claim 2, wherein the fifth processing resource is communicatively connected to a data network with a fifth communicative link, the sixth processing resource is communicatively connected to a data network with a sixth communicative link, the seventh processing resource is communicatively connected to a data network with a seventh communicative link, and the eighth processing resource is communicatively connected to a data network with an eighth communicative link.
 7. The system of claim 1, wherein the hub processing portion comprises: an I/O portion having a plurality of I/O processing elements communicatively connected to a first PCIe switch; and a processing arrangement portion having a processing element communicatively connected to a second PCIe switch, the second PCIe switch communicatively connected to the first PCIe switch.
 8. The system of claim 1, wherein each of the processing resources includes a processing element and an I/O element communicatively connected to a PCIe switch.
 9. The system of claim 1, further comprising: a first graphics processing unit (GPU) portion communicatively connected through a PCIe switch to the PCIe switch portion of the fifth processing resource and the PCIe switch portion of the hub processing portion; a second GPU portion communicatively connected through a PCIe switch to the PCIe switch portion of the sixth processing resource and the PCIe switch portion of the hub processing portion; a third GPU portion communicatively connected through a PCIe switch to the PCIe switch portion of the seventh processing resource and the PCIe switch portion of the hub processing portion; and a fourth GPU portion communicatively connected through a PCIe switch to the PCIe switch portion of the eighth processing resource and the PCIe switch portion of the hub processing portion.
 10. A data processing system comprising: a hub processing portion; and a first plurality of processing resources communicatively connected to define a first ring, wherein each processing resource of the first plurality of processing resources is communicatively connected to the hub processing portion.
 11. The system of claim 10, further comprising a second plurality of processing resources communicatively connected to define a second ring, wherein each processing resource of the second plurality of processing resources is communicatively connected to a corresponding processing resource of the first plurality of processing resources.
 12. The system of claim 10, further comprising a third plurality of processing resources communicatively connected to define a third ring, wherein each processing resource of the third plurality of processing resources is communicatively connected to a corresponding processing resource of the second plurality of processing resources.
 13. The system of claim 10, wherein the first plurality of processing resources communicatively connected to define the first ring are connected via PCIe switch portions of the processing resources of the first plurality of processing resources, and each processing resource of the first plurality of processing resources is communicatively connected to a PCIe switch portion of the hub processing portion via the PCIe switch portions of the processing resources of the first plurality of processing resources.
 14. The system of claim 11, wherein the second plurality of processing resources communicatively connected to define the second ring are connected via PCIe switch portions of the processing resources of the second plurality of processing resources, and each processing resource of the second plurality of processing resources is communicatively connected to the corresponding processing resource of the first plurality of processing resources via the PCIe switch portions of the processing resources of the second plurality of processing resources and the PCIe switch portions of the corresponding processing resources of the first plurality of processing resources.
 15. The system of claim 12, wherein the third plurality of processing resources communicatively connected to define the third ring are connected via PCIe switch portions of the processing resources of the third plurality of processing resources, and each processing resource of the third plurality of processing resources is communicatively connected to the corresponding processing resource of the second plurality of processing resources via the PCIe switch portions of the processing resources of the third plurality of processing resources and the PCIe switch portions of the corresponding processing resources of the second plurality of processing resources.
 16. The system of claim 11, further comprising a plurality of graphics processing units (GPUs), wherein each GPU of the plurality of GPUs is communicatively connected to a corresponding processing resource of the third plurality of processing resources.
 17. The system of claim 16, wherein each GPU of the plurality of GPUs is communicatively connected to the hub processing portion.
 18. The system of claim 10, wherein the hub processing portion is communicatively connected to a data network.
 19. The system of claim 10, wherein each processing resource of the first plurality of processing resources is communicatively connected to a data network.
 20. The system of claim 11, wherein each processing resource of the second plurality of processing resources is communicatively connected to a data network.